Markov decision process (MDP) related papers
This paper studies the optimal policy for joint control of admission, routing, service, and jockeying in a queueing system......
This paper investigates the guidance method based on reinforcement learning (RL) for the coplanar orbital interception in ......